An algorithm is presented for labeling the connected
components of an image represented by a quadtree. The
algorithm proceeds by exploring all possible adjacencies for each
node once and only once. Once this is done, any equivalences
generated by the adjacency labeling phase are propagated.
Analysis of the algorithm reveals that its worst case average
execution time is bounded by a quantity proportional to the
product of the log of the region's diameter and the number of
blocks comprising the area spanned by the components.

1. Introduction

Connected  component  labeling  is  a  basic  operation  in  image 
processing  [RK].  The  standard  labeling  algorithms  use  either 
an  array  or  run-length  representation  for  the  two-level 
("binary")  image  whose  components  are  to  be  labeled.  In  this 
paper we present an algorithm for labeling the connected
components of 1's in a binary image that is represented by a
quadtree ([Klinger, DRS, Samet1]).

We assume that the given binary image is a 2^n by 2^n array
of unit square "pixels." The quadtree is an approach to image
representation based on successive subdivision of the array into
quadrants. In essence, we repeatedly subdivide the array into
quadrants, subquadrants, ..., until we obtain blocks (possibly
single pixels) which consist entirely of either 1's or 0's. This
process is represented by a tree of out-degree 4 in which the
root node represents the entire array. The four sons of the root
node represent the quadrants, and the terminal nodes correspond
to those blocks of the array for which no further subdivision is
necessary. For example, Figure 1b is a block decomposition of
the region in Figure 1a while Figure 1c is the corresponding
quadtree. In general, BLACK and WHITE square nodes represent
blocks consisting entirely of 1's and of 0's, respectively.
Circular nodes, also termed GRAY nodes, denote non-terminal nodes.
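
The subdivision just described can be illustrated with a minimal Python sketch. It is not part of the original presentation: the names Node, build, and count, the dictionary of sons, and the 4 by 4 test image are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field

WHITE, BLACK, GRAY = "WHITE", "BLACK", "GRAY"
QUADRANTS = ("NW", "NE", "SW", "SE")


@dataclass
class Node:
    nodetype: str                             # WHITE, BLACK or GRAY
    level: int                                # block side is 2**level pixels
    sons: dict = field(default_factory=dict)  # quadrant name -> Node (GRAY only)


def build(image, row=0, col=0, size=None):
    """Subdivide the square block whose top-left pixel is (row, col)."""
    if size is None:
        size = len(image)                     # assumed to be a power of two
    level = size.bit_length() - 1             # size == 2**level
    pixels = {image[r][c]
              for r in range(row, row + size)
              for c in range(col, col + size)}
    if len(pixels) == 1:                      # uniform block: terminal node
        return Node(BLACK if pixels == {1} else WHITE, level)
    half = size // 2
    offsets = {"NW": (0, 0), "NE": (0, half), "SW": (half, 0), "SE": (half, half)}
    node = Node(GRAY, level)
    for q in QUADRANTS:
        dr, dc = offsets[q]
        node.sons[q] = build(image, row + dr, col + dc, half)
    return node


def count(node, nodetype):
    """Number of nodes of the given type in the subtree rooted at node."""
    own = 1 if node.nodetype == nodetype else 0
    return own + sum(count(s, nodetype) for s in node.sons.values())


# Hypothetical 2^2 by 2^2 image (row 0 is the northern edge).
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
ROOT = build(IMAGE)
```

For this image the root is GRAY at level 2, its NE son is a single BLACK block of four pixels, and only the SW quadrant needs a second subdivision.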


Sections  2-6  present  and  analyze  our  algorithm.  Included 
is  a  formal  description  of  the  algorithm  along  with  motivating 
considerations.  The  actual  algorithm  is  given  using  a  variant 
of  ALGOL  60  [Naur]. 

2. Definitions and Notation

Let each node in a quadtree be stored as a record containing
seven fields. The first five fields contain pointers to the
node's father and its four sons labeled NW, NE, SE, and SW.
Given a node P and a son I, these fields are referenced as
FATHER(P) and SON(P,I) respectively. At times it is useful to
use the function SONTYPE(P) where SONTYPE(P) = Q iff
SON(FATHER(P),Q) = P. The sixth field, named NODETYPE, describes
the contents of the block of the image which the node represents,
i.e., WHITE if the block contains no 1's; BLACK if the block
contains only 1's; and GRAY if it contains pixels of both types.
Alternatively,
BLACK  and  WHITE  nodes  are  terminal  nodes  while  GRAY  nodes  are 
non-terminal  nodes.  The  seventh  field,  named  REGION,  identifies 
the  connected  component  containing  the  block  represented  by  the 
node.  This  field  is  only  meaningful  for  BLACK  nodes.  It  is  set 
as  a  result  of  the  connected  component  labeling  algorithm. 

LABELED(P) indicates whether node P has already been labeled.
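
As an illustration only, the record and its accessors might be rendered in Python as follows. The class layout, the dictionary holding the four son pointers, the helper make_gray, and the use of None for an unset REGION are assumptions of this sketch, not part of the original definition.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)                      # compare nodes by identity
class Node:
    nodetype: str                         # NODETYPE: "WHITE", "BLACK" or "GRAY"
    father: Optional["Node"] = None       # FATHER pointer (None for the root)
    sons: Dict[str, "Node"] = field(default_factory=dict)  # four SON pointers
    region: Optional[int] = None          # REGION, meaningful for BLACK nodes


def father(p):
    return p.father


def son(p, i):
    return p.sons[i]


def sontype(p):
    """Return Q such that SON(FATHER(P), Q) is P."""
    for q, child in p.father.sons.items():
        if child is p:
            return q
    raise ValueError("node is not a son of its father")


def labeled(p):
    return p.region is not None


def make_gray(children):
    """Hypothetical helper: build a GRAY node and set the FATHER links."""
    g = Node("GRAY")
    for q, child in children.items():
        child.father = g
        g.sons[q] = child
    return g


ROOT = make_gray({"NW": Node("WHITE"), "NE": Node("BLACK"),
                  "SW": Node("BLACK"), "SE": Node("WHITE")})
```

Here sontype(son(ROOT, "NE")) yields "NE", and no node is labeled until REGION is assigned.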

Let the four sides of a node's block be called its N, E, S,
and W sides. They are also termed its boundaries. The
interrelationship between a block's four quadrants and its
boundaries is facilitated by use of the predicate ADJ and the
function REFLECT. ADJ(B,I) is true if and only if quadrant I is
adjacent to boundary B of the node's block; e.g., ADJ(N,NE) is
true. REFLECT(B,I) yields the quadrant which is adjacent to
quadrant I along boundary B of the block represented by I; e.g.,
REFLECT(W,NW)=NE, REFLECT(E,NW)=NE, REFLECT(N,NW)=SW, and
REFLECT(S,NW)=SW. Figure 2 shows the relationship between
the quadrants of a node and its boundaries.
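
A small Python sketch of ADJ and REFLECT follows, assuming that sides and quadrants are encoded as the strings "N", "E", "S", "W" and "NW", "NE", "SW", "SE". That encoding is a choice made here for illustration; tables indexed by side and quadrant would serve equally well.

```python
SIDES = ("N", "E", "S", "W")
QUADRANTS = ("NW", "NE", "SW", "SE")


def adj(b, i):
    """ADJ(B, I): quadrant I touches boundary B of the enclosing block."""
    return b in i                      # e.g. "N" occurs in "NE"


def reflect(b, i):
    """REFLECT(B, I): the quadrant on the other side of boundary B of I."""
    ns, ew = i[0], i[1]
    if b in ("N", "S"):                # crossing a horizontal boundary
        ns = "S" if ns == "N" else "N"
    else:                              # crossing a vertical boundary
        ew = "W" if ew == "E" else "E"
    return ns + ew
```

With this encoding the four examples in the text come out as stated, and each side is adjacent to exactly two quadrants.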

Given a quadtree corresponding to a 2^n by 2^n array, we
say that the root node is at level n, and that a node at
level i is at a distance of n-i from the root of the tree.
In other words, for a node at level i, we must ascend n-i
FATHER links to reach the root of the tree. Note that the
farthest node from the root of the tree is at level 0.

Figure 2. Relationship between a block's four quadrants
and its boundaries.


3. Informal description of the algorithm

The  connected  component  labeling  algorithm  has  three  phases. 
The  first  phase  traverses  the  tree  and  explores  all  possible 
adjacencies  between  pairs  of  BLACK  nodes.  During  this  process 
all BLACK nodes are labeled. Should any equivalences be
discovered between regions already labeled, then their component
identifiers are added to a list of pairs of equivalences.

Once the entire tree is traversed in this manner, the second
phase processes pairs of equivalences to yield equivalence
classes (e.g., [Knuth, Tarjan]). Finally, the third phase
traverses the tree one more time with all members of an
equivalence class being assigned the same component identifier
(i.e., label).

Phase one traverses the tree in postorder (i.e., the sons
of a node are visited first). In particular, the sons are visited
in the order NW, NE, SW, and SE. For each BLACK terminal node,
say P, we explore the eastern and southern adjacencies. This
means that all of the node's adjacent BLACK southern and eastern
neighbors are visited. If they have not been previously visited,
then they are labeled with the label of P. If P does not already
have a label, then it is assigned the label of one of its
adjacent neighbors, provided that neighbor has a label. If
adjacent BLACK nodes have already been assigned labels that are
different, then the labels are added to the list of equivalences
that will be merged in the second phase.


The  key  to  the  algorithm  is  that  phase  one  assures  that 
every  adjacency  of  two  BLACK  nodes  will  be  explored  once  and 
only  once.  To  see  this,  note  that  the  traversal  starts  at 
the  NW-most  son,  if  possible,  and  the  brothers  are  traversed 
in  the  order  NW,  NE,  SW,  and  SE.  Clearly,  by  the  time  any 
BLACK  node  is  visited,  its  northern  and  western  adjacencies 
have already been explored. Thus the northern and western
adjacencies need not be reexplored. This is because each node
labels all of its adjacent eastern and southern neighbors.

Note  the  analogy  between  phase  one  and  the  algorithm  for 
computing  the  total  perimeter  of  an  image  represented  by  a 
quadtree  [Samet2].  When  computing  perimeter  we  must  explore 
adjacencies  of  BLACK  and  WHITE  nodes  rather  than  adjacencies  of 
BLACK and BLACK nodes. Besides the duality in the type of
adjacency, there is only one other difference. The perimeter
computation algorithm requires that adjacencies in four
directions be explored whereas adjacencies in only two directions
need to be explored in phase one. Four directions were necessary
in the perimeter computation algorithm because for each pair of
adjacent BLACK and WHITE nodes only the BLACK node causes the
adjacency to be explored (WHITE nodes do not).

As an example of the application of the algorithm, consider
the image given in Figure 1a. Figure 1b is the corresponding
block decomposition and Figure 1c is its quadtree representation.
All of the BLACK nodes have numbers ranging between 1 and 41
while the WHITE nodes have numbers ranging between 42 and 91.
The BLACK nodes have been numbered in the order in which they
were labeled by phase one. The WHITE nodes have been numbered
in the order in which they were visited (i.e., passed as the
argument to procedure LABEL). Thus node 1 has been labeled before
nodes 2, 3, etc. Figure 3 shows the labels assigned to the five
components. Phase two of the algorithm will merge the equivalence
pair B=D to form component 4 and the equivalence pairs F=G
and G=H to form component 5.


4. Formal statement of the algorithm

The following ALGOL-like procedures specify the connected
component labeling algorithm. Actually, we only present the
procedures corresponding to the first and third phases of the
algorithm. Phase two can be achieved by using a variant of
algorithm E in [Knuth].
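
For readers who want something concrete for phase two, here is a minimal Python sketch that turns a list of equivalence pairs into classes. It uses a generic union-find structure with union by size; it is a stand-in technique, not a transcription of algorithm E, and the function name and the label alphabet A through H are assumptions made for this example.

```python
def process_equivalences(labels, merges):
    """Map every label to a representative of its equivalence class."""
    parent = {x: x for x in labels}
    size = {x: 1 for x in labels}

    def find(x):
        # Follow parent links up to the root of the class.
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in merges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                    # duplicate or already implied pair
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra                 # attach the smaller class to the larger
        size[ra] += size[rb]
    return {x: find(x) for x in labels}


# Pairs reported for the example image; the eight labels are assumed.
CLASSES = process_equivalences("ABCDEFGH", [("B", "D"), ("F", "G"), ("G", "H")])
```

With the three pairs of the running example, eight provisional labels collapse into five classes, matching the five components mentioned in the text.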

The main procedure is termed COMPONENT and is invoked with
a pointer to the root of the quadtree representing the image.
The global variable MERGES is used to accumulate all of the
equivalence relations formed by adjacent BLACK nodes. MERGES
is subsequently processed by phase two to yield a set of
equivalence classes, i.e., one class per component. LABEL
implements phase one by traversing the tree and controlling the
exploration of adjacent BLACK nodes. FIND_NEIGHBOR locates a
neighboring node of greater or equal size along a specified
border. If no such neighboring BLACK or WHITE node exists, then
FIND_NEIGHBOR returns a pointer to a GRAY node of equal size. In
such a case, procedure LABEL_ADJACENT continues the search
recursively by examining all BLACK and WHITE adjacent neighbors
of smaller size. Otherwise, LABEL_ADJACENT assigns a label to the
adjacent neighbor if it is BLACK. The labels are assigned by
procedure ASSIGN_LABEL. Procedure UPDATE corresponds to phase
three and results in the traversal of the tree in order to
propagate the equivalences thereby uniquely labeling each
component.


procedure COMPONENT(QUADTREE);
/* label all of the connected components of the tree rooted at QUADTREE */
begin
  node QUADTREE;
  pairlist MERGES;
  MERGES ← empty;
  LABEL(QUADTREE);
  process equivalences specified by MERGES;
  UPDATE(QUADTREE);
end;


procedure LABEL(P);
/* assign labels to node P and its sons */
begin
  node P,Q;
  quadrant I;
  if GRAY(P) then
    begin
      for I in {NW,NE,SW,SE} do LABEL(SON(P,I));
    end
  else if BLACK(P) then
    begin
      Q ← FIND_NEIGHBOR(P,'E');
      if not NULL(Q) then LABEL_ADJACENT(Q,'NW','SW',P);
      Q ← FIND_NEIGHBOR(P,'S');
      if not NULL(Q) then LABEL_ADJACENT(Q,'NW','NE',P);
      if not LABELED(P) then REGION(P) ← GENREGION();
    end
  else return; /* a WHITE node */
end;


node procedure FIND_NEIGHBOR(P,S);
/* given node P, return a node which is adjacent to side S of node P */
begin
  node P,Q;
  side S;
  if not NULL(FATHER(P)) and ADJ(S,SONTYPE(P)) then
    /* find a common ancestor */
    Q ← FIND_NEIGHBOR(FATHER(P),S)
  else Q ← FATHER(P);
  /* follow reflected path back to locate the neighbor */
  return (if not NULL(Q) and GRAY(Q) then SON(Q,REFLECT(S,SONTYPE(P)))
          else Q);
end;

procedure LABEL_ADJACENT(R,Q1,Q2,P);
/* find all descendants of node R adjacent to node P, i.e., in
   quadrants Q1 and Q2 */
begin
  node P,R;
  quadrant Q1,Q2;
  if GRAY(R) then
    begin
      LABEL_ADJACENT(SON(R,Q1),Q1,Q2,P);
      LABEL_ADJACENT(SON(R,Q2),Q1,Q2,P);
    end
  else if BLACK(R) then ASSIGN_LABEL(P,R)
  else return; /* a WHITE node */
end;


procedure ASSIGN_LABEL(P,Q);
/* assign a label to nodes P and Q if they do not already have
   one. If both have different labels, then enter them in MERGES */
begin
  node P,Q;
  if LABELED(P) and LABELED(Q) then
    begin
      if REGION(P) ≠ REGION(Q) then add <REGION(P),REGION(Q)>
        to MERGES;
    end
  else if LABELED(P) then REGION(Q) ← REGION(P)
  else if LABELED(Q) then REGION(P) ← REGION(Q)
  else REGION(P) ← REGION(Q) ← GENREGION();
end;
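
The procedures above can be exercised end to end with the following Python sketch. It is a hedged rendering, not the original code: the class Node, the helpers B, W, and G, the integer labels, and the 4 by 4 test image are assumptions, and phase two is replaced by a naive union-find loop rather than the algorithm of [Knuth].

```python
import itertools

QUADS = ("NW", "NE", "SW", "SE")

class Node:
    def __init__(self, nodetype, sons=None):
        self.nodetype, self.father, self.region = nodetype, None, None
        self.sons = sons or {}
        for child in self.sons.values():
            child.father = self

def sontype(p):
    return next(q for q, c in p.father.sons.items() if c is p)

def adj(s, q):
    return s in q

def reflect(s, q):
    flip = {"N": "S", "S": "N", "E": "W", "W": "E"}
    return flip[q[0]] + q[1] if s in "NS" else q[0] + flip[q[1]]

def find_neighbor(p, s):
    if p.father is not None and adj(s, sontype(p)):
        q = find_neighbor(p.father, s)          # climb to a common ancestor
    else:
        q = p.father
    if q is not None and q.nodetype == "GRAY":  # descend along reflected path
        return q.sons[reflect(s, sontype(p))]
    return q

def assign_label(p, q, merges, fresh):
    if p.region is not None and q.region is not None:
        if p.region != q.region:
            merges.append((p.region, q.region))
    elif p.region is not None:
        q.region = p.region
    elif q.region is not None:
        p.region = q.region
    else:
        p.region = q.region = next(fresh)

def label_adjacent(r, q1, q2, p, merges, fresh):
    if r.nodetype == "GRAY":
        label_adjacent(r.sons[q1], q1, q2, p, merges, fresh)
        label_adjacent(r.sons[q2], q1, q2, p, merges, fresh)
    elif r.nodetype == "BLACK":
        assign_label(p, r, merges, fresh)

def label(p, merges, fresh):                    # phase one
    if p.nodetype == "GRAY":
        for i in QUADS:
            label(p.sons[i], merges, fresh)
    elif p.nodetype == "BLACK":
        for side, q1, q2 in (("E", "NW", "SW"), ("S", "NW", "NE")):
            q = find_neighbor(p, side)
            if q is not None:
                label_adjacent(q, q1, q2, p, merges, fresh)
        if p.region is None:
            p.region = next(fresh)

def component(root):
    merges, fresh, rep = [], itertools.count(1), {}
    label(root, merges, fresh)
    def find(x):                                # phase two: naive union-find
        while rep.setdefault(x, x) != x:
            x = rep[x]
        return x
    for a, b in merges:
        rep[find(a)] = find(b)
    def update(p):                              # phase three
        if p.nodetype == "BLACK":
            p.region = find(p.region)
        for child in p.sons.values():
            update(child)
    update(root)
    return merges

def regions(p):
    own = [p.region] if p.nodetype == "BLACK" else []
    return own + [r for c in p.sons.values() for r in regions(c)]

def B(): return Node("BLACK")
def W(): return Node("WHITE")
def G(nw, ne, sw, se): return Node("GRAY", dict(zip(QUADS, (nw, ne, sw, se))))

# Rows of the assumed image: 1010 / 1010 / 1110 / 0001.
ROOT = G(G(B(), W(), B(), W()), G(B(), W(), B(), W()),
         G(B(), B(), W(), W()), G(B(), W(), W(), B()))
MERGES = component(ROOT)
```

In this image the two vertical strokes receive labels 1 and 2 before the horizontal stroke joins them, so phase one records the single pair (1, 2); after phase three the U-shaped component shares one label and the isolated pixel in the SE corner keeps its own.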


The running time of the connected component labeling
algorithm is determined by the time necessary to execute its three
phases. Prior to analyzing this value we first examine the
spatial configurations confronted by the algorithm and how they
affect its execution time. It should be clear that the greater
the number of BLACK nodes, the more time is spent exploring
adjacencies in phase one. Phase two is more dependent on the
shape of the various components. The execution time of phase
two is dominated by the number of equivalence pairs that are
generated in phase one. An equivalence pair is generated
whenever an adjacency of a previously labeled node is explored and
it is found that the adjacent neighbor has already been assigned
a different label.

The situation giving rise to the generation of an equivalence
pair can be best seen by examining Figure 4. Components 1, 2,
and 3 do not result in the generation of equivalence pairs
because of the manner in which phase one explores adjacencies,
i.e., in the eastern and southern directions thereby processing
the quadrants in the order NW, NE, SW, SE. Thus we see that the
nodes or blocks comprising the quadtree are processed in the order
in which they are adjacent. This is not always the case for a
component having the form of component 4 in Figure 4 (in this
case we have the equivalence of D and E). This is especially
true when the vertical and horizontal segments are not comprised
of single blocks, or have adjacent northern or western neighbors
in the case of the horizontal segment. For example, in the image
represented by Figure 1a we find that no equivalence pairs were
generated for the components 1, 2, and 3 whereas this was not
true for the components 4 and 5. In particular, we have the
equivalences B=D for component 4 and F=G and G=H for component
5. Note that if the block labeled 40 had been WHITE
rather than BLACK, then block 41 would have been labeled with
G and no equivalence pair would have been generated.

Phase one depends on the speed of the combination of
procedures LABEL_ADJACENT and FIND_NEIGHBOR. These procedures are
invoked in phase one (i.e., in procedure LABEL) twice for each
BLACK node. The actual amount of work performed by these
procedures is more accurately represented by considering the
number of nodes that are visited when an adjacency is being
explored. Recall that we must find the neighbor, and if it is
GRAY, then visit all adjacent neighbors of a smaller size. In the
worst case, we are at level n-1, with a GRAY neighbor, and all
adjacent neighbors at level 0. In such a case, we must visit 2^n
nodes. For example, consider Figure 5 where n=3 and we wish to
visit the blocks adjacent to the block labeled A (i.e., blocks B,
C, D, and E). We must visit the root of the quadtree as well as
A's neighboring GRAY node and all of its NW and SW sons, i.e., a
complete binary tree of height 2. In total, 2^3 = 8 nodes are
visited. In general, let the space be partitioned into a 2^n by
2^n array. Assume a random image, i.e., a BLACK node is equally
likely to appear in any position and level in a quadtree.
Recalling the analogy drawn in Section 3 between phase one and the
perimeter computation algorithm we have the following result:

Theorem 1: The average of the maximum number of nodes visited
by LABEL_ADJACENT is n+1.

Proof: See Theorem 1 in [Samet2].
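
The worst-case count of 2^n nodes quoted before Theorem 1 can be checked directly. The sketch below assumes, as in the Figure 5 example, that the search touches the root once and then a complete binary tree rooted at the GRAY neighbor at level n-1 and extending down to level 0; the summation index k is introduced here only for illustration.

```latex
\[
\underbrace{1}_{\text{root}} \;+\; \sum_{k=0}^{n-1} 2^{k}
  \;=\; 1 + \left(2^{n} - 1\right) \;=\; 2^{n},
\qquad\text{e.g. } n = 3:\; 1 + (1 + 2 + 4) = 8 .
\]
```

This is the worst case for a single adjacency; Theorem 1 concerns the average of such maxima under the random image assumption, which is the much smaller quantity n+1.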

The speed of phase two of the algorithm depends on the
method used for processing equivalence relations and on the
number of pairs of equivalences and different objects of the set
on which the equivalences are defined. We use a variant of an
algorithm presented in [Knuth]. Its maximum execution time is
proportional to the square of the number of pairs of equivalences.
It can be sped up by using a modification due to D. McIlroy
[Knuth] so that its maximum execution time is proportional to the
product of the number of pairs of equivalences and the log of the
size of the set on which the equivalences are defined.

Recall  that  equivalence  pairs  are  generated  during  phase  one 
only  when  we  are  exploring  the  adjacencies  of  a  node  that  is 
already  labeled  and  its  neighbor  has  also  been  labeled  before, 
albeit  with  a  different  label.  We  now  prove  the  following  lemma: 

Lemma 1: Phase one generates a maximum of one equivalence pair
for each adjacency that is explored (i.e., each call to
procedure LABEL on a BLACK node explores two adjacencies).

Proof: There are two cases depending on the direction of the
adjacency.

Case (a): An adjacency in the eastern direction can yield at
most one equivalence pair regardless of the size of
the neighbor. This is clearly true if the neighbor
is larger (e.g., blocks 38 and 35 in Figure 1b).
Similarly, if the neighbor is smaller, then only the
northernmost such neighbor could have been previously
labeled (e.g., blocks 12 and 20 in Figure 1b) because
only it could have been the southern neighbor of a
previously labeled node.

Case (b): An adjacency in the southern direction can only yield
an equivalence pair if the neighbor is larger. No
equivalence pair may result if the neighbor is smaller.
This should be clear since southern neighbors could
only have been visited if they are adjacent to a
western neighbor which has been visited previously.

Q.E.D.

Letting B denote the number of BLACK nodes we have the following
theorem:

Theorem 2: 2B log B is an upper bound on the execution time of
phase two.

Proof: By the above lemma, phase one generates a maximum of
one equivalence pair for each adjacency that is explored.
Recall that phase one explores two adjacencies for each BLACK
node, so at most 2B equivalence pairs are generated. Also the set
on which the equivalence pairs are defined has a maximum number
of objects equal to the number of BLACK nodes, i.e., B. Thus when
the equivalence merging algorithm of [Knuth] is used, one has
2B log B as the upper bound for the execution time.

Q.E.D.

Clearly, B ≤ 3*2^(2n-2) since at most 3 of every 4 sons of a
node at level 1 in a complete quadtree are BLACK. At this point
we show how the upper bound of Theorem 2 may be tightened.
Assume a 2^n by 2^n array and n ≥ 2. We first prove the following
lemma.

Lemma 2: 2^(2n-3) is an upper bound on the number of objects in
the set upon which the equivalences are generated in phase one.

Proof: Recall our discussion with respect to the types of
equivalences our algorithm is least proficient at handling.
Clearly, the best algorithm is one that never generates pairs of
equivalences. We mentioned the existence of a worst case
configuration of nodes where an equivalence pair had to be
generated (see Component 1 of Figure 4). Clearly, the upper bound
of Theorem 1 arises when the quadtree corresponds to a single
component, i.e., once all of the pairs of equivalences are merged,
a single equivalence class results. Therefore, consider a case
where the minimal instance of the worst case is replicated, e.g.,
the 2^4 by 2^4 array in Figure 6. In the general case, i.e., a
2^n by 2^n array, there are 2^(n-1) objects on which equivalence
pairs are generated for the first four consecutive rows and
2^(n-1)-1 such objects for each remaining set of four consecutive
rows. Thus in a 2^n by 2^n array there are less than 2^(2n-3)
objects in the set upon which equivalence pairs are generated
(e.g., 1 through 29 in Figure 6).

Q.E.D.
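
The count in the proof of Lemma 2 can be written out explicitly. The sketch below takes the per-band counts to be 2^(n-1) objects for the first four rows and 2^(n-1)-1 objects for each of the remaining 2^(n-2)-1 bands of four rows; that reading is an assumption, chosen because it reproduces the 29 objects cited for Figure 6 when n = 4.

```latex
\[
2^{n-1} + \left(2^{n-2} - 1\right)\left(2^{n-1} - 1\right)
  \;=\; 2^{2n-3} - 2^{n-2} + 1 ,
\qquad n = 4:\; 8 + 3 \cdot 7 = 29 < 2^{5} = 32 .
\]
```

The total is strictly below 2^(2n-3) whenever n is at least 3, and equals it for n = 2.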

We can now prove Theorem 2' as follows:

Theorem 2': 2B(2n-3) is an upper bound on the execution time
of phase two.

Proof: From Lemma 2 we have that the maximum number of objects
in the set upon which the equivalences are generated in phase
one is less than 2^(2n-3). From Theorem 2 and [Knuth] the
execution time of phase two is bounded by the product of 2B and
the log of the maximum number of objects in the set upon which the
equivalence pairs are derived, i.e., 2B log 2^(2n-3) = 2B(2n-3).

Q.E.D.

The speed of phase three of the algorithm can be obtained in
a straightforward way. The tree must be traversed and for each
BLACK node P, REGION(P) must be set to the head of the equivalence
class obtained as a result of phase two. The actual lookup
operation is bounded by the log of the maximum number of objects
in the set upon which the equivalences were generated in phase
one. From Lemma 2 we have the value 2n-3. An upper bound on the
size of the tree is obtained by the following lemma.

Lemma 3: The upper bound on the number of nodes of the quadtree
is 4Bn+1.

Proof: See Lemma 1 in [Samet2]. Q.E.D.

We  now  have  the  following  theorem: 

Theorem 3: The upper bound of the execution time of phase three
is proportional to B(2n-3) + 4Bn+1.

Proof: Use Lemmas 2 and 3 and the time required to access the
head of an equivalence class.

Q.E.D. 

Using Lemma 3 we obtain an upper bound on the average worst case
execution time of phase one.

Theorem 4: The upper bound on the average worst case execution
time of phase one is 6Bn+2B+1.

Proof: From Theorem 1 we have that for each adjacency involving
a BLACK node, phase one results in an average worst case of n+1
nodes being visited. There are two adjacencies for each BLACK
node. Also from Lemma 3 we have that the tree traversal component
of phase one visits at most 4Bn+1 nodes. Therefore, we have
2B(n+1) + 4Bn+1 = 6Bn+2B+1 nodes being visited.


Q.E.D. 


At  this  point  we  come  to  the  main  result: 

Theorem 5: The average worst case execution time of phases one,
two, and three has an upper bound proportional to the product
of the number of BLACK nodes and the log of the diameter of
the image.

Proof: The log of the diameter of the image is n. Summing up
the contributions of phases one, two, and three as indicated by
Theorems 4, 2', and 3, we have the result 6Bn+2B+1 + 3B(2n-3) +
4Bn+1 = 16Bn-7B+2.

Q.E.D. 
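
For reference, the sum in the proof of Theorem 5 expands as follows. The grouping by phase is added here for illustration; the term 3B(2n-3) in the proof is taken to combine the 2B(2n-3) of Theorem 2' with the B(2n-3) lookup cost of Theorem 3.

```latex
\begin{align*}
\underbrace{6Bn + 2B + 1}_{\text{phase one (Theorem 4)}}
 &+ \underbrace{2B(2n-3)}_{\text{phase two (Theorem 2')}}
  + \underbrace{B(2n-3) + 4Bn + 1}_{\text{phase three (Theorem 3)}} \\
 &= (6 + 4 + 2 + 4)\,Bn + (2 - 6 - 3)\,B + 2 \\
 &= 16Bn - 7B + 2 .
\end{align*}
```

The leading term 16Bn is what makes the bound proportional to the product of B and n.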


6. Concluding remarks


An algorithm has been presented for labeling the connected
components of a binary image. The analysis was somewhat
facilitated by the analogy with the perimeter computation
algorithm for an image represented by a quadtree. The algorithm's
running time has been shown to have an average worst case time
complexity proportional to the product of the log of the image's
diameter and the number of the terminal nodes describing the area
spanned by the components. This is of the same order of magnitude
as the complexity of the perimeter computation algorithm in
[Samet2] although it should be clear that the latter is smaller.
Note that Theorems 2 and 2' yielded different upper bounds on the
execution speed of phase two. We chose to use the result of
Theorem 2' because it had a direct relationship to the log of the
region's diameter.

In general, the performance of the algorithm is quite good
because in actuality very few equivalence pairs are generated.
Some variant of the worst case in terms of configurations leading
to the generation of an equivalence pair will occur no matter
which order of traversing the adjacencies is adopted. It should
be clear that phase two can be combined with phase one by
performing the merge dictated by the equivalence immediately in
procedure ASSIGN_LABEL rather than using the list MERGES and
executing phase two. We chose the previous approach in order to
simplify the presentation of the analysis. Also note that Lemma 1
ensures that the upper bound of the execution time of phase two
is not affected by generating the same equivalence pairs more
than once (e.g., in Figure 7 the equivalence A=B is generated
once by blocks 6 and 2 and once by blocks 9 and 2).

Note  finally  that  if  the  algorithm  is  applied  to  both  the 
BLACK  and  the  WHITE  nodes,  one  can  compute  the  number  of  holes, 
and  hence  the  genus,  of  the  image.  Genus  can,  of  course,  also 
be  computed  by  counting  the  number  of  occurrences  of  various 
local  patterns  in  the  image. 

Figure 1. An image, its maximal blocks, and the corresponding
quadtree. Blocks in the image are shaded.
(c) Quadtree representation of the blocks in (b).